Reverberation robust speech recognition by matching distributions of spectrally and temporally decorrelated features

Authors

  • Kalle J. Palomäki
  • Heikki Kallasjoki
Abstract

This paper addresses dereverberation of speech with an unsupervised approach that uses a speech prior and makes only weak assumptions about the reverberation. Our approach represents reverberated speech over a long time context as spectro-temporal supervectors, which are decorrelated by PCA. In the decorrelated domain, supervectors are mapped from the reverberant speech distribution to the clean speech distribution and then to mel-spectral vectors. A mel-domain Wiener filter is applied as post-processing. Our results demonstrate performance gains over the provided baseline recognizer and show that the method can be coupled with CMLLR adaptation, with cumulative benefits for clean-trained models. Furthermore, we show that dimensionality reduction coupled with the Wiener filter represents small-variance components in speech better than full-dimensional PCA.
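
As a rough illustration of the pipeline outlined in the abstract, the Python sketch below stacks mel-spectral frames into long-context supervectors, decorrelates them with a reduced-dimension PCA, maps each decorrelated dimension from the reverberant distribution to the clean distribution, and projects back to mel-spectral frames. The function names, the context length, the number of components, fitting the PCA on clean speech, and the use of per-dimension empirical quantile mapping as the distribution-matching step are all assumptions made for illustration; the abstract does not specify them. The mel-domain Wiener post-filter is omitted.

```python
import numpy as np


def stack_supervectors(mel, context):
    """Stack `context` consecutive frames of a (T, B) array into
    supervectors of shape (T - context + 1, context * B)."""
    n = mel.shape[0] - context + 1
    return np.stack([mel[i:i + context].ravel() for i in range(n)])


def unstack_supervectors(sv, context, n_bands):
    """Invert stack_supervectors by averaging the overlapping frame estimates."""
    n = sv.shape[0]
    total_frames = n + context - 1
    acc = np.zeros((total_frames, n_bands))
    cnt = np.zeros((total_frames, 1))
    frames = sv.reshape(n, context, n_bands)
    for i in range(n):
        acc[i:i + context] += frames[i]
        cnt[i:i + context] += 1
    return acc / cnt


def fit_pca(x, n_components):
    """Return the mean and the top `n_components` principal directions of x."""
    mean = x.mean(axis=0)
    cov = np.cov(x - mean, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]
    return mean, eigvec[:, order]


def match_quantiles(src, ref):
    """Map each column of src onto the empirical distribution of the
    same column of ref (assumed stand-in for distribution matching)."""
    out = np.empty_like(src)
    n = src.shape[0]
    for d in range(src.shape[1]):
        ranks = np.argsort(np.argsort(src[:, d]))
        cdf = (ranks + 0.5) / n
        out[:, d] = np.quantile(ref[:, d], cdf)
    return out


def enhance(reverb_mel, clean_mel, context=5, n_components=20):
    """Hypothetical end-to-end sketch; clean_mel is non-parallel clean data."""
    sv_clean = stack_supervectors(clean_mel, context)
    sv_rev = stack_supervectors(reverb_mel, context)
    mean, basis = fit_pca(sv_clean, n_components)  # assumption: PCA fit on clean speech
    z_clean = (sv_clean - mean) @ basis
    z_rev = (sv_rev - mean) @ basis
    z_mapped = match_quantiles(z_rev, z_clean)
    sv_hat = z_mapped @ basis.T + mean
    return unstack_supervectors(sv_hat, context, reverb_mel.shape[1])
```

Here `clean_mel` only supplies the target distribution and need not be time-aligned with `reverb_mel`, which is in keeping with the unsupervised setting the abstract describes. Choosing `n_components` smaller than the supervector dimension corresponds to the dimensionality reduction mentioned in the last sentence of the abstract.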

Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have proven effective in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Robust Features and System Fusion for Reverberation-robust Speech Recognition

Reverberation in speech degrades the performance of speech recognition systems, leading to higher word error rates. Human listeners can often ignore reverberation, indicating that the auditory system somehow compensates for reverberation degradations. In this work, we present robust acoustic features motivated by the knowledge gained from human speech perception and production, and we demonstra...

Feature enhancement of reverberant speech by distribution matching and non-negative matrix factorization

This paper describes a novel two-stage dereverberation feature enhancement method for noise-robust automatic speech recognition. In the first stage, an estimate of the dereverberated speech is generated by matching the distribution of the observed reverberant speech to that of clean speech, in a decorrelated transformation domain that has a long temporal context in order to address the effects ...

A Missing Data Approach for Robust Automatic Speech Recognition in the Presence of Reverberation

We describe a technique for robust recognition of reverberated speech using the ‘missing data’ paradigm. Modulation filtering is used to identify time-frequency regions of the speech signal which are relatively uncontaminated by reverberation and contain strong speech energy; only these ‘reliable’ acoustic features are made directly available to the recogniser. The proposed system is evaluated ...

Combating reverberation in large vocabulary continuous speech recognition

Reverberation leads to high word error rates (WERs) for automatic speech recognition (ASR) systems. This work presents robust acoustic features motivated by subspace modeling and human speech perception for use in large vocabulary continuous speech recognition (LVCSR). We explore different acoustic modeling strategies and language modeling techniques, and demonstrate that robust features with a...

Publication date: 2014